NSF PAR Search | NSF Public Access Repository


Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories

Zhang, Zixuan; Chen, Minshuo; Wang, Mengdi; Liao, Wenjing; Zhao, Tuo (July 2025, Proceedings of the 40th International Conference on Machine Learning, PMLR 202:40911-40931, 2023.)

Free, publicly accessible full text available July 3, 2026

Opportunities and challenges of diffusion models for generative AI

https://doi.org/10.1093/nsr/nwae348

Chen, Minshuo; Mei, Song; Fan, Jianqing; Wang, Mengdi (November 2024, National Science Review)

Diffusion models, a powerful and universal generative artificial intelligence technology, have achieved tremendous success and opened up new possibilities in diverse applications. In these applications, diffusion models provide flexible high-dimensional data modeling, and act as a sampler for generating new samples under active control towards task-desired properties. Despite the significant empirical success, theoretical underpinnings of diffusion models are very limited, potentially slowing down principled methodological innovations for further harnessing and improving diffusion models. In this paper, we review emerging applications of diffusion models to highlight their sample generation capabilities under various control goals. At the same time, we dive into the unique working flow of diffusion models through the lens of stochastic processes. We identify theoretical challenges in analyzing diffusion models, owing to their complicated training procedure and interaction with the underlying data distribution. To address these challenges, we overview several promising advances, demonstrating diffusion models as an efficient distribution learner and a sampler. Furthermore, we introduce a new avenue in high-dimensional structured optimization through diffusion models, where searching for solutions is reformulated as a conditional sampling problem and solved by diffusion models. Lastly, we discuss future directions about diffusion models. The purpose of this paper is to provide a well-rounded exposure for stimulating forward-looking theories and methods of diffusion models.
Full Text Available
Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

Zhang, Zixuan; Zhang, Kaiqi; Chen, Minshuo; Takeda, Yuma; Wang, Mengdi; Zhao, Tuo; Wang, Yu-Xiang (October 2024, Advances in Neural Information Processing Systems)

Convolutional residual neural networks (ConvResNets), though overparameterized, can achieve remarkable prediction performance in practice, which cannot be well explained by conventional wisdom. To bridge this gap, we study the performance of ConvResNeXts, which cover ConvResNets as a special case, trained with weight decay from the perspective of nonparametric classification. Our analysis allows for infinitely many building blocks in ConvResNeXts, and shows that weight decay implicitly enforces sparsity on these blocks. Specifically, we consider a smooth target function supported on a low-dimensional manifold, then prove that ConvResNeXts can adapt to the function smoothness and low-dimensional structures and efficiently learn the function without suffering from the curse of dimensionality. Our findings partially justify the advantage of overparameterized ConvResNeXts over conventional machine learning models.
Full Text Available
Theoretical insights for diffusion guidance: A case study for Gaussian mixture models

Wu, Yuchen; Chen, Minshuo; Li, Zihao; Wang, Mengdi; Wei, Yuting (July 2024, International Conference on Machine Learning)

Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces

Liu, Hao; Yang, Haizhao; Chen, Minshuo; Zhao, Tuo; Liao, Wenjing (January 2024, Journal of Machine Learning Research)

Learning operators between infinitely dimensional spaces is an important learning task arising in machine learning, imaging science, mathematical modeling and simulations, etc. This paper studies the nonparametric estimation of Lipschitz operators using deep neural networks. Non-asymptotic upper bounds are derived for the generalization error of the empirical risk minimizer over a properly chosen network class. Under the assumption that the target operator exhibits a low dimensional structure, our error bounds decay as the training sample size increases, with an attractive fast rate depending on the intrinsic dimension in our estimation. Our assumptions cover most scenarios in real applications and our results give rise to fast rates by exploiting low dimensional structures of data in operator estimation. We also investigate the influence of network structures (e.g., network width, depth, and sparsity) on the generalization error of the neural network estimator and propose a general suggestion on the choice of network structures to maximize the learning efficiency quantitatively.
Full Text Available
Deep Nonparametric Estimation of Operators between Infinite Dimensional Spaces

Liu, Hao; Yang, Haizhao; Chen, Minshuo; Zhao, Tuo; Liao, Wenjing (January 2024, Journal of Machine Learning Research)
Raginsky, Maxim (Ed.)
Full Text Available
High Dimensional Binary Classification under Label Shift: Phase Transition and Regularization

Cheng, Jiahui; Chen, Minshuo; Liu, Hao; Zhao, Tuo; Liao, Wenjing (October 2023, Sampling Theory, Signal Processing, and Data Analysis)

Full Text Available
Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data

Chen, Minshuo; Huang, Kaixuan; Zhao, Tuo; Wang, Mengdi (July 2023, Proceedings of Machine Learning Research)

Diffusion models achieve state-of-the-art performance in various generation tasks. However, their theoretical foundations fall far behind. This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace. Our result provides sample complexity bounds for distribution estimation using diffusion models. We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated. Further, the generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution. The convergence rate depends on subspace dimension, implying that diffusion models can circumvent the curse of data ambient dimensionality.
Full Text Available
